MIT-LCS-TM-017

Technical Memorandum 17 

(This report was reproduced from an M.S. Thesis, MIT, 
Dept. of Electrical Engineering, September 1971.) 

Leonard I. Goodman 

September 1971 

Massachusetts Institute of Technology
Cambridge, Massachusetts 02139


I wish to thank Professor John Donovan, the supervisor of 
this thesis, for his encouragement, enthusiasm, and guidance 
during the research and preparation of this report. His help 
is deeply appreciated.

I acknowledge fellow graduate student Jerry Johnson for 
the many discussions we had during the early formulation of 
this work, and Cathy Doyle for her typing of this thesis report. 
Finally, I thank my wife Hindi for her patience and encour- 
agement during my graduate study. 

Work reported herein was supported in part by Project MAC, 
an M. I. T. research program sponsored by the Advanced Research 
Projects Agency, Department of Defense, under Office of Naval 
Research Contract Number Nonr-4102 (01). Reproduction of this 
document, in whole or in part, is permitted for any purpose of 
the United States government. 

Leonard I. Goodman


A theory of complexity is developed for algorithms imple- 
mented in typical programming languages. The complexity of a 
program may be interpreted in many different ways; a method for 
measuring a specific type of complexity is a complexity measure 
-- some function of the amount of a particular resource used by 
a program in processing an input. Typical resources would be 
execution time, core, I/O devices, and channels. 

Any resource whose use is independent of previous and future 
usage can be handled by the theory. This condition includes time 
but excludes space complexity. For a specific measure, the com- 
plexity of the basic programming elements can be determined and 
used to compute the complexity of an arbitrary program with a 
particular input. Because this method gives little information 
about the general complexity behavior of the program, another 
approach is developed. 

This new approach analyzes the complexity of a program with 
respect to a valid set of inputs -- a finite set of legitimate, 
halting inputs. A program equation is developed to make the trans- 
formations undergone by the inputs more explicit. Using the equa- 
tion, the input set is partitioned into classes of constant com- 
plexity. The classes are used to compute maximum, minimum, and 
expected complexities of the program on the input set. 

Several equivalence relations are defined, relating different 
programs by their complexity. Complexity is also discussed in terms 
of concatenation and functional equivalence of programs. 

*This report reproduces a thesis of the same title submitted to 
the Department of Electrical Engineering, Massachusetts Institute 
of Technology, in partial fulfillment of the requirements for the 
degree of Master of Science, September 1971. 

Table of Contents

Chapter I - Introduction
  1. Functions, Algorithms, Programs
  2. Complexity Measures
  3. Previous Work
  4. Graph Models of Programs

Chapter II - Complexity of Arbitrary Programs for Single Inputs
  1. Introduction
  2. Constraints on Complexity Measures
  3. Complexity of Basic Program Elements
  4. Complexity of Arbitrary Programs
  5. The Set Approach

Chapter III - Program Equations and Set Complexity
  1. Introduction
  2. Input Sets
  3. Program Equations
  4. Complexity Equivalence Classes
  5. Conclusion of Example Program
  6. Summary

Chapter IV - Complexity of Advanced Constructs and Input Schemes
  1. Introduction
  2. Subroutine and Function Calls
  3. LOOP Blocks
  4. Multiple Inputs
  5. Different Data Types
  6. String Input

Chapter V - Results in the Complexity Theory of Programming Languages
  1. Introduction
  2. Preliminary Definitions
  3. Relations Between Programs with Identical Input Sets
  4. Concatenation
  5. Functional Equivalence

Chapter VI - Conclusions and Suggestions for Further Study

Appendix - Mathematical Notation

References


Chapter I. Introduction 

Within the past ten years, there has been an increased 
interest by computer scientists and mathematicians in the theory 
of computational complexity. This theory is concerned with 
measuring the difficulty of computing functions and with studying 
the properties of measures of computational difficulty. Most 
of the work done in this field has remained within the domains of 
recursive function theory and the analysis of Turing machine 
computation of functions. (See, for example, Hartmanis and 
Hopcroft [1] for an overview of complexity theory.) One area of 
complexity theory that has not received much attention is the 
analysis of functions represented by computer programs. The 
current research is directed towards this area. 

We attempt to apply the basic principles of computational complexity theory to algorithms which are implemented in typical programming languages. One of the basic ideas of this complexity theory of programming languages is to view the complexity behavior of a program over a finite set of "valid inputs", rather than over some infinite domain or for only one input. A "valid input" is one for which the program in question halts and which the program is actually intended to process. Looking at the complexity for all the elements in a set of this type will enable us to get a better picture of the complexity behavior of the program. We will also be able to define some relationships between different programs if the complexities of their elements relate in certain ways.

1. Functions, Algorithms, Programs 

We have mentioned the concept of computational complexity with regard to functions, algorithms, and programs without 
clearly defining these three terms and explaining the differences 
between them. 

A function defines an association between the objects 
of one set (the domain of the function) and the objects of 
another set (the range). The method of determining the object 
in the range set corresponding to the object in the domain need 
not be explicitly stated; a set of ordered pairs of the form 

(domain element, range element) 
completely defines a function. Even if a rule for the function is given (e.g., via a lambda expression [2]) the evaluation of the rule may remain indeterminate. For example, if we have the expression

λx. x + x*x

do we perform the addition first or the multiplication?

The computational complexity of a function would 
have to measure the difficulty of computing the range element 
given a domain element, no matter what rule was used to determine 
the range element (more than one rule may specify the same 
function), or how the rule was evaluated (as long as the evalua- 
tion procedure produced the correct answer). Complexity theory 
of functions is outside of the current discourse. 


An algorithm is either the specification of the rule 
which defines a function and the method for evaluating the rule, 
or simply the evaluation procedure for a given rule. An algorithm 
is frequently presented in terms of a flow chart, where each 
of the nodes of the chart represents a basic operation whose 
meaning and evaluation are (hopefully!) unambiguous. 

The complexity of an algorithm is more basic than 
that of a function. In the case of an algorithm, we need only 
examine the specific evaluation procedure defined by the algorithm 
in order to determine the complexity behavior we wish to observe. 
However, in our complexity analysis, we will eventually come down 
to analyzing the basic elements of the algorithm: arithmetic 
operations, assignments of values to variables, testing condi- 
tions, branching, etc. The complexity of these simple operations 
generally cannot be specified any further. We may choose to 
express these operations in terms of the corresponding Turing 
machine operations and deal with Turing complexity. Although 
this may be adequate in some cases, the complexity of an algorithm 
on a Turing machine does not give much insight into the complexity 
of the same algorithm written in a programming language and run 
on a computer. 

A program is the implementation of an algorithm in a 
particular programming language. If the program is written in 
the assembly language of a particular machine, we can determine 
the complexity of the basic elements of the programs in terms of 


the characteristics of that machine. If the program is written in a high-level language, the complexity properties of the basic operations would not be completely constrained until we specify which machine the program will run on (and probably how the language translator to be used on the program would work). However, we may choose to leave the complexity in terms of the basic operations of the high-level language.

2. Complexity Measures 

The computational complexity of a program may be interpreted in many different ways. A scheme for measuring a specific type of complexity will be called a complexity measure. Basically, a complexity measure is some function of the amount of a particular resource used by a program as it processes a specific input value. This resource might be time, space, CPU usage, channel activity, etc. We might have a program φ with input n. Associated with φ is a measuring program Φ which is "monitoring" the execution of φ. Φ(n) would tell us the amount of a particular resource used by φ to compute φ(n). Thus the program Φ is measuring the complexity of φ.

In the recursive function formulation of complexity theory, a complexity measure is a recursive enumeration of all partial recursive functions φ_i, to each of which is associated a step-counting function Φ_i. Φ_i is constrained to satisfy the following two axioms (from Blum [3]):

1. φ_i(n) is defined iff Φ_i(n) is defined

2. M(i,n,m) = 0 if Φ_i(n) ≠ m
            = 1 if Φ_i(n) = m

   is a recursive function

By defined, we mean that a function halts for a particular input.

Uses of Complexity Measures 

As we have stated, a complexity measure provides infor- 
mation on the resource usage of a program. As long as programmers 
have been writing computer programs, they have been concerned with 
the resource usage of their programs; specifically, they have 
wanted to know how long their programs would run and how much 
core they would require. As multiprocessed and time-shared 
computer systems evolved, programmers wanted to know about the 
use of system resources other than CPU time and core: channel 
usage, device usage, secondary storage requirements, supervisor 
usage, etc. The amount of each of these resources used by a 
program would constitute a different measure of its complexity. 
A theory of computational complexity would give us a 
method for quantitatively analyzing the complexity behavior of 
computer programs and for comparing different programs on the 
basis of this behavior. Hopefully, this theory should be some- 
what independent of which type of complexity is being measured, 
so that the same techniques would be suitable for a number of 
different resources. 

3. Previous Work 

There has been little work done in the area of complex- 
ity measures for programming languages. Meyer and Ritchie [4] 


found some weak bounds on the running time of a class of pro- 
grams called Loop Programs. These programs compute exactly 
the primitive recursive functions. However, programs written 
in most languages will compute recursive functions which are not 
primitive recursive. Thus, the Loop Program analysis is not 
general enough. 

Ramamoorthy [5] studied the time complexity of programs 
which could be modelled by discrete Markov processes: each 
decision-making element in the program is statistically indepen- 
dent of all others. He felt that his analysis would be useful 
for micro-programmed instruction sequences. However, the techniques developed would not work in the case of an arbitrary program.

4. Graph Model of Programs 

We will be using the graph model to represent the 
structure of computer programs. We present here an informal 
definition of this model. A graph of a program consists of a 
set of nodes connected by a set of directed arcs . The nodes 
represent the statements or elements of the program; they 
will be labelled with a statement identifier or function name. 
The arcs represent the flow of control in the program. They 
determine the execution sequence of the nodes. More than one 
arc may leave a particular node. In this case, each arc will 
be labelled with a unique selector that determines which node 
will be executed next. 


A successor of a given node is a node which is pointed 
to by a directed arc from the given node. A predecessor of a 
given node is a node which points to the given node. A node 
may precede or succeed itself. A node with no arcs leaving it 
is a terminal node ; it is the last node to be executed. A program 
graph may have more than one terminal node. The node which is 
pointed to by an arc that has no node at its other end is the 
starting node . A graph may have only one such node. The arc 
leading into the starting node may have an input value or set 
of values at its other end. The entire graph will usually be 
labelled with the name of the program. 

Thus, the program A consisting of the statements s_1, s_2, s_3, where s_2 is a conditional statement with two possible successors, would be represented as:

(program graph of A not reproduced)

x is the input value; s_1 is the starting node and s_3 is the terminal node. The successors of s_2 are s_1 and s_3; the predecessor of s_3 is s_2. T and F are selectors for s_2.


Chapter II. Complexity of Arbitrary Programs for Single Inputs 
1. Introduction 

Before we analyze the complexity behavior of programs 
over a set of inputs, we will first present methods for determining 
the complexity of a program with respect to one input. This will 
involve defining the complexity of the basic programming constructs 
to be used in the programs, and specifying a set of rules which 
will enable us to compute the complexity of a group of basic con- 
structs which have been combined. These rules will determine the 
types of complexity measures for which our methods and techniques 
will be valid. It happens that these same rules will be sufficient for analyzing complexity behavior over a set of inputs.

2. Constraints on Complexity Measures 

We will require that our measures of complexity satisfy 
three rules. The first two rules are the axioms of Blum presented in Chapter I. The first axiom, φ_i(n) defined iff Φ_i(n) defined, implies that the measuring function (program) Φ_i must depend on the entire computation of φ_i on input n; if this were not true, then Φ_i might produce an answer even though φ_i did not halt. This axiom also implies that when φ_i terminates, we have all the necessary information to determine the complexity.

The second axiom,

M(i,n,m) = 0 if Φ_i(n) ≠ m
         = 1 if Φ_i(n) = m

states that we can always tell if φ_i operating on input n will have complexity m. In the case of time complexity we could let our program φ_i with input n run until it had executed for m time units. If φ_i halted, M(i,n,m) = 1. If φ_i continued to execute, M(i,n,m) = 0.

The third rule is the linearity constraint. It is this rule which allows us to find the complexity of a group of basic elements which are formed into a structured program. The constraint may be stated as follows:

Let A be a program with input x, such that A can be divided into two segments s_1 and s_2, where s_1 is the predecessor of s_2. A graph for A would be:

(graph not reproduced)

We can represent the complexity of A with input x as C(A,x). Similarly, the complexity of s_1 is C(s_1,x). However, s_2 does not have input x but rather some transformation upon x, induced by s_1. We can represent the input to s_2 as s_1(x). The complexity of s_2 is then C(s_2,s_1(x)). Linearity requires that for all x for which A halts,

C(A,x) = C(s_1,x) + C(s_2,s_1(x))

Another way to state the linearity constraint is that any use of the resource in question must be independent of how much was used previously or how much will be used in the future, but dependent upon transformations of the input.


Time of execution satisfies this constraint. Other resources which also do are the number of calls to the supervisor program, device usage, and channel usage (from the point of view of the number of times the channel is used). One resource that does not generally satisfy the constraint is the amount of core used by a program. Since core may be shared, it may be available for different uses at different times. In the previous example, some of the space needed for s_2's computations may be done in s_1's area. Thus,

C(A,x) ≠ C(s_1,x) + C(s_2,s_1(x))

However, if space is never reused, then space complexity may be incorporated into the general theory. For any resource which obeys this constraint, the methods to be presented may be used to determine the complexity of a program with regard to the use of that resource.

3. Complexity of Basic Program Elements 

Our goal is to be able to determine the complexity of an arbitrarily structured program. First, we will define the complexity of the basic elements of our language. This language will not be any specific one but will be representative of modern high-level languages. Below, we list one possible set of basic elements. Naturally, we cannot include every possible program construct; however, those listed are found in many languages.

Arithmetic operations
Assignment
Transfer of control
Conditional statements
Iteration statements
Function calls
I/O and supervisor calls
In discussing each of these elements, we will use time as a sample resource. Of course, "time" may be expressed in microseconds, CPU cycles, or any other units. Other complexity measures may be handled similarly.

Arithmetic Operations 

This category includes the common mathematical opera- 
tions. The time complexity of any of these operations is just the 
time required to execute it. If we are dealing with an assembly or 
machine language program, we may express the time in terms of the 
instruction execution time of the corresponding machine. If we 
are writing in a high-level language, and if we know the computer 
which will execute the machine code resulting from our program, 
we may express the time complexity in a similar manner (ignoring 
any compiler optimization). 

If we do not know which machine will execute our program, 
we may choose to measure the complexity in terms of the number of 
additions, the number of multiplications, and the number of other 
independent operations. We may think of having an n-dimensional "complexity vector", where n is the number of independent operations. The "unit vector" for each of the dimensions is the complexity of a basic operation; the coefficient of this unit vector is the number of such operations which have occurred. We can reduce this vector only if we express the unit complexities in terms of something else, such as the machine instruction times of a specific computer. Then we can obtain one value for the complexity, as we did in the case of an assembly language program.

Note that we are treating the complexity of these opera- 
tions as having a fixed value, independent of the value of the 
operands. If this assumption were not true, we would have to examine 
the sub-operations which form an operation until we found some constructs which were complexity-invariant. We will need this condition when we examine complexity for a set of inputs.
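The "complexity vector" of the preceding paragraphs can be sketched as a bag of operation counts; the trace and the per-operation unit times below are made up, standing in for the instruction times of some specific machine.

```python
from collections import Counter

def vector_complexity(ops):
    """The complexity vector: a count for each independent operation."""
    return Counter(ops)

def reduce_vector(vec, unit_times):
    """Collapse the vector to one value, given machine unit times."""
    return sum(count * unit_times[op] for op, count in vec.items())

trace = ["add", "add", "mul", "assign"]        # operations a run performed
vec = vector_complexity(trace)                 # the n-dimensional vector
total = reduce_vector(vec, {"add": 2, "mul": 7, "assign": 1})
```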

Assignment; Transfer of Control 

By assignment, we mean the assigning of the value of one 
variable to another. Transfer of control is the familiar BRANCH 
or GOTO statement. These two constructs may be treated in the same 
way as the arithmetic operations: either their complexity may 
be expressed in terms of instruction execution time or they may 
be treated as two of the independent operations. We also assume 
that the complexity of these operations is a fixed value. 


The prototype of our conditional statement will be:

IF p(x) THEN s_1 ELSE s_2

p is some test on the value of x which does not change this value. If this test is satisfied (p(x) TRUE), then s_1 will be executed; otherwise s_2 will be executed. We can represent the structure of the conditional by the following graph model:

(graph not reproduced)

Using the linearity condition and noting that p does not change the value of x, the complexity of our conditional will be:

C(cond,x) = C(p,x) + C(s_1,x)  if p(x) TRUE
          = C(p,x) + C(s_2,x)  if p(x) FALSE

p, s_1, and s_2 may all represent complex constructions, which can be broken down to basic elements.
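The two-case formula can be sketched directly. The predicate, branches, and costs below are invented; each part reports a fixed cost, and p does not change x.

```python
def p(x):             # predicate: fixed cost 1, does not change x
    return x % 2 == 0, 1

def s1(x):            # THEN branch: fixed cost 4
    return x // 2, 4

def s2(x):            # ELSE branch: fixed cost 2
    return 3 * x + 1, 2

def C_cond(x):
    """Returns (result, cost): C(p,x) + C(s1,x) if p(x) TRUE,
    else C(p,x) + C(s2,x)."""
    taken, cp = p(x)
    result, cs = s1(x) if taken else s2(x)
    return result, cp + cs
```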


In most languages, the programmer has the ability to execute a section of program repeatedly, depending upon certain conditions. This is the basis of the iteration statement. We usually have a variable, defined only for the iteration construction, whose value is incremented from a lower limit to some upper limit while the statements within the bounds of the iteration statement (the body) are executed for each increment of the variable. Some languages allow additional features: multiple ranges for the control variable, negative increments, negative lower limits, attaching a conditional test to the iteration, and others. Examples of the iteration construction are the FORTRAN DO statement, the ALGOL FOR statement, and the PL/I DO statement.

We will use a particularly simple form of iteration statement. We will retain only the concept of executing a body of statements for a certain number of times. This is the LOOP block and has the following form:

(LOOP block diagram not reproduced)

The semantic interpretation of the LOOP block is that the body is executed N (contents of N) times in succession. Changes to N within the body do not affect the number of times that the body is executed. The body may contain other LOOP blocks; thus they may be nested to any level.

The complexity of a LOOP block may be interpreted in several ways. We may view the LOOP structure as equivalent to N physical copies of the body of the block. We may then compute the complexity using linearity. Alternatively, since the LOOP block is usually a feature of a high-level language, we may examine its translation in machine language. This will involve the initialization of a dummy variable, the body of the block, the incrementing of the variable, a test to see if we have exceeded the number of iterations, and a transfer to the beginning of the body. The complexity of the block could then be computed using the complexity of these elements and the linearity condition.

We will use the simpler interpretation of complexity. If we denote the body of the LOOP block by s, and assume that all statements within s are a function of the variable x, then the LOOP block will have complexity

C(LOOP-BLOCK,x) = C(s,x) + C(s,s(x)) + C(s,s^2(x)) + C(s,s^3(x)) + ... + C(s,s^(N-1)(x))

where s^i denotes the functional composition of s with itself i times.

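The N-copies reading of the LOOP block can be sketched as follows; the body s is invented, and its reported cost depends on its input, so the i-th copy contributes C(s, s^i(x)).

```python
def s(x):
    """LOOP body: transforms x and reports an input-dependent cost."""
    return x + 1, 2 * x + 1

def C_loop(N, x):
    """C(LOOP,x) = C(s,x) + C(s,s(x)) + ... + C(s, s^(N-1)(x))."""
    total = 0
    for _ in range(N):
        x, cost = s(x)      # the next copy of the body sees s^i(x)
        total += cost
    return total
```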
Function Calls 

A function or subroutine call involves a call to the subprogram executed in the calling program, and the execution of the subprogram itself. We will assume that the call statement is a basic construct similar to an arithmetic operation and that its complexity is independent of its argument; furthermore, we will assume that it does not change the value of its argument. The value of the call statement argument, and hence the input to the subprogram, will be some transformation of the original input to the calling program. As an example, consider the program A with input x which contains a call to the function B with argument x (actually some transformation on the original value of x). A could be represented in graphical form as:

(graph not reproduced)

The complexity of A would then be

C(A,x) = C(s_1,x) + C(CALL) + C(B,s_1(x)) + C(s_2, B∘s_1(x))

All occurrences of x in this expression denote the original input value of x.
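Under these assumptions the call rule can be sketched as follows; the segments, the subprogram B, and the fixed cost of the call statement are all invented for illustration.

```python
CALL_COST = 1                        # fixed, argument-independent

def s1(x): return x + 2, 3           # (transformed value, cost)
def B(x):  return x * x, x           # the called subprogram
def s2(x): return x - 1, 2

def C_A(x):
    """C(A,x) = C(s1,x) + C(CALL) + C(B,s1(x)) + C(s2, B(s1(x)))."""
    y, c1 = s1(x)                    # input to B is s1(x)
    z, cb = B(y)
    _, c2 = s2(z)
    return c1 + CALL_COST + cb + c2
```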

I/O and Supervisor Calls 

We can view these constructs in the same light as arithmetic operations. For a particular machine, the time complexity of these elements is simply the time needed to execute them. In a high-level language, we may treat them as independent operations.


Types of Complexity 

Certain elements have a constant complexity which does not depend on their inputs. This group includes arithmetic operations, assignment, transfer, and the supervisor operations. Conditional statements have a constant complexity for those inputs for which p(x) is true, and another constant complexity for those inputs for which p(x) is false. This assumes that the elements composing p, s_1, and s_2 are all of constant complexity. Finally, we have LOOPs and subroutine calls, which have a complexity dependent upon the input and number of iterations for the LOOP block and the input for the subroutine or function.

If certain supervisor operations have different complexities for different inputs, we may treat them as subroutines.

4. Complexity of Arbitrary Programs 

Having defined the complexity of our basic programming 
constructs, we can use these complexities and the linearity condi- 
tion to find the complexity C(A,x) of any program A with any 
input X for which A halts. (We may easily extend our results to 
include the case of programs with multiple inputs.) This approach, 
however, does have shortcomings. 

The complexity of a program with respect to one input 
will not usually give us much information about the general com- 
plexity behavior of the program. We will know even less if the 
program does not halt for that input. If we wish to learn more 


about the complexity behavior, we will need to repeat our calcu- 
lations for a large set of inputs. Although the program may 
exhibit the same complexity behavior for many different inputs, 
we will not be able to take advantage of this relationship because 
we are dealing with each input individually. The domain of input 
values is arbitrarily large until we bind our program to a partic- 
ular machine or language specification. Because of the nature of 
this domain, we cannot bound the values of C(A,x) since A defines 
an arbitrary partial recursive function. Finally, we have no 
facility for comparing the complexity behavior of two different 
programs other than inputting the same value to both programs and 
calculating the resultant complexity. 

5. The Set Approach 

The remaining chapters will use the work on single input 
complexity to develop another approach to program complexity. This 
approach examines the complexity behavior of a program over a 
finite set of inputs which have certain useful properties. The 
set approach will make it easy to examine such behavior as the 
expected value of complexity and maximum and minimum complexity; 
the complexity structure of a program will become more apparent. 
Also, the set approach will enable us to examine complexity rela- 
tions between different programs. 


Chapter III. Program Equations and Set Complexity 

1. Introduction 

Having defined the complexity of the basic program elements and developed methods for determining the complexity of a program with respect to one input, we are ready to deal with program complexity with respect to a set of inputs. We develop the concepts of the program equation and a valid set of inputs to aid in the complexity analysis. We make some simplifying assumptions about the type and number of inputs to our programs and the type of components in these programs. In the next chapter we remove these restrictions to obtain a more general model. An explanation of the mathematical notation used in this and later chapters will be found in the appendix.

2. Input Sets 

We will assume a simplified form of input structure for our programs. These programs will have only one input, which will be a non-negative integer; further, we will assume that all operations upon this input result in non-negative integer values. Given a program A which satisfies these conditions, let U_1 be the set of non-negative integer inputs for which A halts, and which are valid inputs to A. By a valid input, we mean an input which A is actually intended to process. Thus, if A is meant to process only even non-negative integers, then U_1 contains only these integers, though A may halt for some odd integers or, in fact, for all odd integers.


Next, let U_2 be the set of allowable non-negative integer values for the machine on which A is running or for the language in which A is written. (If both the language and the machine limit the set of values, we will use those conditions which are more restrictive.) U_2 may be quite large, but it is always finite. We can now define a valid input set X to the program A as:

X = U_1 ∩ U_2

We see that X is finite and for all inputs x in X, A halts.

We may be interested in examining only some of the valid inputs to A at any particular time. If X' ⊆ X, we will say that X' is also a valid input set to A. X' has the same properties as X except that it does not contain all the valid halting inputs for A. We will refer to X as the maximal valid input set to program A for a particular U_2 -- i.e., for a particular machine or language realization.
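The construction of a maximal valid input set can be sketched for an invented program that is intended to process (and halts on exactly) the even non-negative integers, with a made-up machine limit standing in for U_2.

```python
def in_U1(x):
    """Membership in U1: A halts on x and x is an intended input."""
    return x % 2 == 0

U2_MAX = 100                              # machine/language bound (made up)
U2 = set(range(U2_MAX + 1))               # allowable non-negative integers

X = {x for x in U2 if in_U1(x)}           # X = U1 ∩ U2, always finite
```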

3. Program Equations 

We now define a method for obtaining an equation 
representation for a program. We will initially assume that 
the program contains no LOOP blocks or subroutine calls, and 
has only one input. These restrictions will be removed later. 

The concept of a program equation is based on the work 
of Zeiger [6]. He derives some relationships between programs, 
polynomials, and power series. We start the derivation of 


the equation by putting our program A into graphical representation. With each node of the graph, we can associate a function on the inputs to that node. Since we have temporarily eliminated LOOP's and subroutines, we have remaining arithmetic operators, assignment, transfer, conditionals, and supervisor calls. Transfer does not change any values. Assignment is a type of arithmetic operation, and supervisor calls change values of variables in specified ways. Thus, we have two general types of functions: arithmetic and conditional.

An arithmetic function f maps a set X into another set F, defined as:

F = f(X) = { f(x) | x ∈ X }

A conditional function p maps X into a set P. We will say that p(x) = x if x satisfies the predicate p; otherwise p(x) is undefined. Therefore, p acts as an identity function for those inputs which satisfy its associated conditional test. We can see that p(x) is defined if and only if p(x) = x. Then

P = p(X) = { x ∈ X | p(x) is defined }
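The conditional function can be sketched as a partial identity, with None standing in for "undefined"; the predicate is invented, and p_bar plays the role of the complementary conditional function.

```python
def make_conditional(pred):
    """Conditional function: identity where pred holds, else undefined."""
    def p(x):
        return x if pred(x) else None     # None stands for "undefined"
    return p

p = make_conditional(lambda x: x < 10)
p_bar = make_conditional(lambda x: not (x < 10))   # the complement of p

X = {3, 7, 12, 20}
P = {x for x in X if p(x) is not None}             # P = p(X)
```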

We are using the conditional function in a different sense than the conditional statement of the preceding chapter. That statement had the form

IF p(x) THEN s_1 ELSE s_2

p(x) is assumed to take either the value TRUE or FALSE. If p(x) = TRUE, we execute s_1; otherwise, we execute s_2. In the case of a conditional function, we have the following situation:

(graph not reproduced: two arcs leave the node, one selected when p(x) is defined, the other when p̄(x) is defined)

We define the function p̄(x) as follows:

p̄(x) defined iff p(x) undefined

If p(x) is defined, its value is x; we take the arc with the appropriate selector and execute s_1 with input x. We cannot execute s_2 because the selector leading to this node was not satisfied.

Conversely, if p(x) is undefined, p̄(x) is defined; its value is x. We then execute s_2 with input x. One case or the other (but not both) must happen. We also have the relation

p̿(x) = p(x)

so that if our conditional function is p̄(x), our selectors will remain the same. We will abbreviate "p(x) defined" by "T" (TRUE) and "p̄(x) defined" by "F" (FALSE). This is in keeping with the standard notation of program conditionals.

Having specified the function associated with each
node of the program graph, we define the equation of the pro-
gram as the summation of all possible functional paths from
the starting node of the graph to any terminal node. A func-
tional path is defined as the composition of the functions
associated with the nodes which comprise the path. The resulting
composed function is applied to the set of inputs. The sum
of these functional applications is set equal to the function
defined by the entire program applied to the set of inputs.

Given a program A with input set X, its equation would be

A(X) = Σ_{i=0}^∞ f_i(X)

where each f_i is a composition of simpler functions. The
summation is of infinite extent because an arbitrary program
graph contains an infinite number of paths.

We illustrate these concepts with a sample program:

[program graph: conditional node p; the T arc leads through g back to p; the F arc leads to the halting node]

The simplest path through A starts at p and takes the F arc
to the halting node. We will assume that this latter node has
no functional or complexity significance. If we recall that
F is equivalent to "p̄(x) defined", the first term of A(X) will be

f_0(X) = p̄(X) = { x∈X | p̄(x) defined }

The next simplest path starts at p, takes the T
arc to g, returns to p, and then takes the F arc to the halting
node. The composition of the functions encountered in this
path is (with the right-most function encountered first)

f_1 = p̄·g·p

This function applied to X yields the next term of A(X):

f_1(X) = { g(x) | x∈X & p(x) defined & p̄(g(x)) defined }

Continuing in this manner we get the complete equation:

A(X) = p̄(X) + p̄·g·p(X) + p̄·(g·p)²(X) + ...

     = Σ_{i=0}^∞ p̄·(g·p)^i(X)


where (g·p)⁰ = the identity function. There is no unique corre-
spondence between a particular index and a particular functional
path. In this case, we have calculated functions in order of
increasing path length. It happened that

f_i = p̄·(g·p)^i

However, any other one-to-one correspondence between the f_i
and the functional paths would work. We will conclude this
example later in the chapter.

The application of f_i to X results in a set of values.
We define

F_i = f_i(X)

The +'s in our equation are to be interpreted as set union. Then

F_i ⊆ A(X)   and   ∪_{i=0}^∞ F_i = A(X)

Each f_i is generally composed of arithmetic and
conditional functions. Suppose f_i = g_k·...·g_1. We will say
that f_i(x) is defined if and only if every conditional
function g_j ∈ { g_1, g_2, ..., g_k } applied to its argument
is defined; i.e., g_j(g_{j-1}·...·g_1(x)) is defined. Then
f_i(x)∈F_i if and only if f_i(x) is defined. Now we can define
the set of elements in X which exactly yield each F_i:

X_i = { x∈X | f_i(x) is defined }

We then have the relation F_i = f_i(X_i).
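A functional path can be computed directly by composing its subfunctions. The sketch below is illustrative (our naming, not the thesis's notation); it uses the chapter's sample functions, with None propagating to mark "undefined", and recovers X_1 and F_1 for one path.

```python
def compose(*fs):
    """Compose subfunctions; the right-most is applied first,
    and an undefined (None) result propagates."""
    def f(x):
        for g in reversed(fs):
            if x is None:
                return None
            x = g(x)
        return x
    return f

p    = lambda x: x if x <= 5 else None   # conditional: p(x) defined iff x <= 5
pbar = lambda x: x if x > 5 else None    # complement conditional p-bar
g    = lambda x: 2 * x                   # arithmetic: x := 2x

X  = set(range(1, 11))
f1 = compose(pbar, g, p)                    # f_1 = p-bar . g . p
X1 = {x for x in X if f1(x) is not None}    # X_1 = { x | f_1(x) is defined }
F1 = {f1(x) for x in X1}                    # F_1 = f_1(X_1)
```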


Simplification of the Equation 

We can reduce our equation if we require that X be
a valid input set to A. We then have that X is finite and for
all inputs in X, A halts. Since X is finite, there are only
a finite number of non-empty X_i. Let m be the greatest integer
for which X_m ≠ ∅ (the empty set); then for all i>m, X_i = ∅ and F_i = ∅.
We then have:

1. A(X) = Σ_{i=0}^m f_i(X)

2. ∪_{i=0}^m X_i = X (it may be that some of the X_i are empty;
   this will occur if the corresponding F_i is empty)

3. X_i ∩ X_j = ∅ if i≠j (because programs are deterministic,
   any input undergoes only one transformation)

4. Each f_i is composed of only a finite number of sub-
   functions since A halts for all x in X.

Thus, given a valid input set, we can reduce our program equa-
tion to a finite number of terms, each term of finite extent.
We also get a partition { X_0, X_1, ..., X_m } of our original input set.

4. Complexity Equivalence Classes 

The subsets of X, { X_i }, have the following property:
for all x∈X_i, the functional path associated with x is the same.
Since we have allowed no LOOP blocks or subroutines in our
programs, and since all x in X_i take the same branch from every
conditional test, the complexity of this functional path is con-
stant. This complexity is completely defined in terms of the
functions in the path and has a value C(A,x), x∈X_i. X_i
will be called a complexity equivalence class. Associated
with each X_i will be an equivalence class complexity, C(A,X_i).
By definition,

C(A,X_i) = C(A,x), x∈X_i

The complexity information for program A with input
set X is contained in the two sets

{ X_0, X_1, ..., X_m }

{ C(A,X_0), C(A,X_1), ..., C(A,X_m) }

We observe that many programs do not treat every input differently.
Often, many inputs will be similarly processed. Therefore, we
are led to believe that the number of equivalence classes will
often be smaller (though not always much smaller) than the
number of distinct input values. In this case, we have a (rela-
tively) compact description of the complexity behavior of A.

Uses of Complexity Equivalence Classes 

The complexity equivalence classes may be used to 
calculate the following quantities with respect to a particular 

input set: 

1. Expected value of complexity 


2. Maximum value of complexity 

3. Minimum value of complexity 

In a later chapter we will show how these classes are useful 
in determining relations between different programs. 

Expected Complexity Value 

Expected complexity value will tell us the average 
resource usage by a program over a particular input set. The 
expected value of a function g over a discrete-valued domain 
X is (see, for example, Drake [7]) the sum of the products of 
values of the function at points in the domain and the prob- 
ability that the particular point will be chosen: 

E(g,X) = s g(x).Pr(x) 

If C(A,X) is the expected value of the complexity of A with 
input set X, 

C(A,X) « E C(A,x).Pr(x) 

Since U X « X, 

i«0 ^ 

C(A,X) - E S C(A,x).Pr(x) 

But for all xgX^, C(A,x) = C(A,xp; and since x^cX^ and x^eX^ 
are independent events. 

Pr(X ) = S Pr(x) 


Th en. 

S C(A,x).Pr(x) = r C(A,X,).Pr(x) 

xeX^ xeX. 

C(A,X^)(2 Pr(x)) = C(A,X^).Pr(X^) 


/. C(A,X) = E C(A,X )-Pr(X ) 

The probabilities Pr(X_i) are based on any method
used in selecting the inputs to A, and take into account any
a priori knowledge of the selection method. If the selection
of inputs is random,

Pr(X_i) = | X_i | / | X |

If X_i = ∅, then for any selection method, Pr(∅) = 0. The value
of C̄(A,X) is not changed by averaging over empty sets.
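This computation is mechanical once the classes and their complexities are known. The helper below is our own sketch (not from the thesis), shown with hypothetical class sizes and numeric stand-ins for the class complexities; it assumes uniformly random input selection.

```python
from fractions import Fraction

def expected_complexity(classes, complexities):
    """Expected complexity = sum_i C(A,X_i) * Pr(X_i), with
    Pr(X_i) = |X_i| / |X| under uniformly random selection.
    Empty classes contribute 0, as in the text."""
    total = sum(len(Xi) for Xi in classes)   # |X|, since the X_i partition X
    return sum(c * Fraction(len(Xi), total)
               for Xi, c in zip(classes, complexities))
```

Exact rational arithmetic (fractions.Fraction) keeps the weighted sum free of floating-point error.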

Maximum Value of Complexity 

The maximum complexity is an upper limit on the usage
of a particular resource for any execution of a program, given
an input value from the input set in question. To compute
this quantity, we need only find the maximum value of C(A,X_i)
over the X_i. We must remember that some X_i may be empty; so
we only look at C(A,X_i) for non-empty X_i. The maximum com-
plexity will be denoted C_max(A,X).


Minimum Value of Complexity 

The minimum complexity is the corresponding lower limit of
resource usage by the program. It is the minimum of C(A,X_i)
over the non-empty X_i. It will be denoted C_min(A,X).

C_max(A,X) and C_min(A,X) are important quantities. They
say that any execution of the program will require at least
C_min(A,X) of resources, but never more than C_max(A,X), when
inputs are taken from X.
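A corresponding sketch for the two limits (our own helper, not part of the thesis) simply scans the non-empty classes:

```python
def complexity_limits(classes, complexities):
    """Return (C_max(A,X), C_min(A,X)): the extreme values of C(A,X_i)
    taken over the non-empty equivalence classes only."""
    vals = [c for Xi, c in zip(classes, complexities) if Xi]  # skip X_i = empty
    return max(vals), min(vals)
```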

5. Conclusion of Example Program 

Returning to our sample program introduced earlier in this
chapter, we will specify the input set and the functions com-
prising the program:

X = { 1,2,3,...,10 }

g(x): x := 2x (x is assigned the value 2x)

p(x) = x iff x ≤ 5

p̄(x) = x iff x > 5

A(X) = Σ_{i=0}^∞ f_i(X)

f_0 = p̄    f_1 = p̄·(g·p)    f_2 = p̄·(g·p)²    ...

F_0 = f_0(X) = { x∈X | p̄(x) is defined } = { 6,7,8,9,10 }
X_0 = { 6,7,8,9,10 }

F_1 = f_1(X) = p̄({ 2,4,6,8,10 }) = { 6,8,10 }
X_1 = { 3,4,5 }

F_2 = f_2(X) = p̄({ 4,8 }) = { 8 }
X_2 = { 2 }

F_3 = f_3(X) = p̄({ 8 }) = { 8 }
X_3 = { 1 }

∴ (∀n≥4) [ F_n = X_n = ∅ ]

∴ A(X) = Σ_{i=0}^3 f_i(X)


For x∈X_0, the functional path consists of the single
conditional test, so C(A,x) = C(p̄,x), x∈X_0.

Note that the complexity of the conditional test (p itself)
is the same whether p(x) or p̄(x) is defined; therefore,

C(p,x) = C(p̄,x) = c_p,   ∀x∈X

Letting c_g denote the (constant) complexity of g, and recalling that

C(A,X_i) = C(A,x), x∈X_i

we have, for x∈X_1,

C(A,x) = C(p,x) + C(g,x) + C(p̄,g(x)) = c_p + c_g + c_p = 2c_p + c_g

so that

C(A,X_0) = c_p
C(A,X_1) = 2c_p + c_g
C(A,X_2) = 3c_p + 2c_g
C(A,X_3) = 4c_p + 3c_g


To determine C̄(A,X), the expected complexity, we need to know
the values of Pr(X_i). Let us assume a random distribution.

Pr(X_i) = | X_i | / | X |

C̄(A,X) = Σ_{i=0}^3 C(A,X_i)·Pr(X_i)

       = (5/10)c_p + (3/10)(2c_p+c_g) + (1/10)(3c_p+2c_g) + (1/10)(4c_p+3c_g)

       = (9/5)c_p + (4/5)c_g

C_max(A,X) = C(A,X_3) = 4c_p + 3c_g
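The whole example can be checked by direct simulation. The Python sketch below is our reconstruction of the sample program (test p; if x ≤ 5, double x and repeat; if x > 5, halt), counting executions of p and of g as abstract c_p and c_g units; it recovers the four equivalence classes and the expected counts.

```python
from collections import defaultdict
from fractions import Fraction

def run_A(x):
    """Simulate the sample program and return the complexity signature
    (number of conditional tests p, number of applications of g)."""
    tests = doublings = 0
    while True:
        tests += 1          # one execution of the conditional test (c_p)
        if x > 5:           # p-bar(x) defined: take the F arc and halt
            return tests, doublings
        x *= 2              # g: x := 2x  (c_g)
        doublings += 1

classes = defaultdict(set)
for x in range(1, 11):                 # X = {1, ..., 10}
    classes[run_A(x)].add(x)           # group inputs by (c_p, c_g) counts

# expected counts under Pr(X_i) = |X_i| / |X|
exp_p = sum(Fraction(len(s), 10) * sig[0] for sig, s in classes.items())
exp_g = sum(Fraction(len(s), 10) * sig[1] for sig, s in classes.items())
```

The expected counts reproduce the coefficients 9/5 and 4/5 derived above.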

6. Summary

We have developed several important concepts for the 
complexity study of programs - valid input sets, an equation 
representation of a program, complexity equivalence classes, 
and equivalence class complexities. We have seen the use of 
these concepts in the analysis of complexity behavior. Finally, 
we have shown how the equivalence classes and their associated 
complexities have given a compact and orderly representation of 
the complexity information for a program over a set of inputs.


Chapter IV. Complexity of Advanced Constructs and Input Schemes 

1. Introduction 

In this chapter we extend our complexity analysis pro- 
cedures to include programs with additional programming features, 
such as subroutine calls and LOOP blocks, and also programs with 
more than one input. We also discuss input data types other than 
the non-negative integers and finally variable length inputs. 

2. Subroutine and Function Calls 

Let us assume that we have a program A with valid input
set X. A has the finite equation representation

A(X) = Σ_{i=0}^m f_i(X)

We can express the j-th term, f_j, as a composition of subfunctions:

f_j = g_k·...·g_i·...·g_1

Suppose g_i is a call to the program B. Let Y be the maximal valid
input set to B. The equation for B is then

B(Y) = Σ_{i=0}^p h_i(Y)

B will be called from A at point g_i within f_j with a set of values

X' = g_{i-1}·...·g_1(X)


where the series of composed functions is the transformation
applied to the elements of X. If we assume that A is working
correctly, then all values passed from A will be valid inputs
to B. Thus X' ⊆ Y. The equation for B(X') will have, at most,
as many terms as that for B(Y). Then

B(X') = Σ_{i=0}^q h_i(X'),   q ≤ p

To obtain the complexity equivalence classes for A with
input set X, we cannot simply calculate the X_i corresponding to
the F_i = f_i(X). The reason is that there is no unique complexity
associated with X_j since f_j contains a call to B which has q+1
distinct functional paths associated with it. To get the true
equivalence classes for A, we will have to substitute the terms of
B(X') into A(X) between g_{i-1} and g_{i+1} in f_j. Assuming no other
function calls in A, this procedure will yield an expanded set of
functions f_i', each having a unique functional path. Any value
computed in B and returned to A will be available to g_{i+1}. This
expansion yields the following equation for A:

A(X) = Σ_{i=0}^{j-1} f_i(X) + g_k·...·g_{i+1}·(Σ_{l=0}^q h_l)·g_{i-1}·...·g_1(X)

       + Σ_{i=j+1}^m f_i(X) = Σ_{i=0}^{m'} f_i'(X)


We can now calculate the complexity equivalence classes

X_i' = { x∈X | f_i'(x) is defined }

and the associated class complexities C(A,X_i').

We are assured of finding a new finite representation
for A(X) because:

1. A(X) (unexpanded) has a finite number of terms.

2. B(Y) has a finite number of terms.

3. A is assumed to be operating correctly so that X' ⊆ Y.
   Therefore, B(X') has a finite representation. Intro-
   ducing B(X') into A(X) yields a new finite representa-
   tion for A(X).

This method is general enough to handle multiple levels
of subroutine calls, multiple calls to the same subroutine from
the same calling program, and recursive subroutine calls. To
illustrate multiple levels of calls, suppose that in the pre-
vious example, B called a program C with input set Y'. We would
find the equation for C(Y'), substitute this into B(X'), and sub-
stitute the expanded B(X') into A(X). We know that the process
of subroutine calls must terminate since A halts for all input
values in X.

For the case of multiple calls from the same program,
assume A called B twice, once with input set X' and once with set X''.
We would use our analysis procedure to find B(X') and B(X'') and sub-
stitute these into A(X).


Suppose A calls itself as a subroutine. These recur-
sive calls must terminate because of the halting condition. If
the set of values passed from A to itself is X', we know that
X' ⊆ X. After finding A(X'), we substitute it into A(X).

Example of Subroutine Call 

We will consider a program A with input set X which
calls a program B with maximal input set Y.

X = { 1,2,3,...,9 }
p(x) = x iff x ≢ 0 mod 3
g(x): x := x+1

Y = { 3,6,9,12,15,...,3n,... }
h(y): y := y/3
q(y) = y iff y ≡ 0 mod 3

Denoting the node "CALL B(x)" by "b", we have the following
equations:

A(X) = b·p̄(X) + b·p̄·g·p(X) + b·p̄·(g·p)²(X) + ...

     = Σ_{i=0}^∞ f_i(X)

B(Y) = q̄·h(Y) + q̄·h·q·h(Y) + q̄·h·(q·h)²(Y) + ...

     = Σ_{i=0}^∞ h_i(Y)

Using our analysis on just the equation for A(X), we get the
following classes:

X_0 = { 3,6,9 }    X_1 = { 2,5,8 }    X_2 = { 1,4,7 }

A(X) = Σ_{i=0}^2 f_i(X)


However, since the inputs in any X_i undergo different transforma-
tions in B, these are not complexity equivalence classes. We will
have to substitute B(X') into A(X) wherever B is called from A.

Let X'_0 be the set of inputs to B when B is called from
f_0. Then

X'_0 = p̄(X) = { 3,6,9 } ⊆ Y

H_0 = h_0(X'_0) = { 1,2 }   (from the inputs { 3,6 })
H_1 = h_1(X'_0) = { 1 }     (from the input { 9 })

Since X'_0 = { 3,6 } ∪ { 9 },

B(X'_0) = Σ_{i=0}^1 h_i(X'_0) = q̄·h(X'_0) + q̄·h·q·h(X'_0)

Similarly, let X'_1 be the set of inputs to B when it is called
from within the function f_1. Then,

X'_1 = p̄·g·p(X) = { 3,6,9 } = X'_0

B(X'_1) = Σ_{i=0}^1 h_i(X'_1)

Finally we let X'_2 be the inputs to B when it is called from f_2.

X'_2 = p̄·(g·p)²(X) = { 3,6,9 } = X'_0

B(X'_2) = Σ_{i=0}^1 h_i(X'_2)


We can now substitute the three equations for B into the appropriate
places in A(X) to get the expanded version of this equation:

A(X) = B(X'_0)·f_0(X) + B(X'_1)·f_1(X) + B(X'_2)·f_2(X)

     = h_0·f_0(X) + h_1·f_0(X) + h_0·f_1(X) + h_1·f_1(X)

       + h_0·f_2(X) + h_1·f_2(X)

     = Σ_{i=0}^5 f_i'(X)

Each of the classes X_i', i=0,1,...,5, now corresponds to a set of
inputs with a single complexity value; each X_i' is a complexity
equivalence class.
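The expansion can be checked by simulation. The sketch below is our own reconstruction of A and B (not the thesis's notation), using the count of g-steps in A and h-steps in B as the complexity signature; it groups X = {1,...,9} into exactly six classes.

```python
from collections import defaultdict

def run_A(x):
    """Simulate A (g: x := x+1 until x is divisible by 3, then CALL B)
    and B (h: y := y/3 until the result is not divisible by 3).
    Return (increments performed in A, divisions performed in B)."""
    a_steps = 0
    while x % 3 != 0:        # p(x) defined iff x is not divisible by 3
        x += 1               # g: x := x + 1
        a_steps += 1
    b_steps = 0
    while True:
        x //= 3              # h: y := y / 3
        b_steps += 1
        if x % 3 != 0:       # q-bar(y) defined: B halts
            return a_steps, b_steps

classes = defaultdict(set)
for x in range(1, 10):       # X = {1, ..., 9}
    classes[run_A(x)].add(x)
```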

3. LOOP Blocks 

We have discussed the LOOP block in a previous chapter and
defined the complexity of this construction as follows:

If the body of the block is denoted by s and all state-
ments within s are a function of x, then the LOOP block
will have complexity

C(s,x) + C(s,s(x)) + C(s,s²(x)) + ... + C(s,s^{x-1}(x))

We can now include LOOPs into our theory of program equations
and complexity equivalence classes. We will illustrate the
handling of loops by several examples.

Suppose we have the following program A with input x:

[program listing: LOOP x; g; h; END]

If A has a valid input set X, we may represent A in graph format
as follows:

[program graph: nodes g and h enclosed in a dotted box, with "x" just outside the box]

The dotted lines delineate the scope of the LOOP block. The "x"
just outside the box defines x as the iteration control variable.
The equation for A(X) will have to indicate that g and
h will be executed a different number of times for each value of
x in X. Assuming no attributable complexity to "END", the equation is

A(X) = Σ_{x∈X} (g·h)^x(x)

If |X| = n, this sum will expand to n terms. If g and h have a
constant associated complexity, there will be n complexity equiva-
lence classes for A with input set X. In this case,

X_i = { x_i }   and   C(A,X_i) = C(A,x_i)
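For this simple LOOP program the classes can be tabulated directly. A small illustrative sketch (ours, with unit costs c_g = c_h = 1 assumed):

```python
def loop_complexity(x, c_g=1, c_h=1):
    """LOOP x: the body g;h executes exactly x times, so
    C(A,x) = x * (c_g + c_h) when g and h have constant complexity."""
    return x * (c_g + c_h)

X = range(1, 6)
# each input forms its own equivalence class: X_i = { x_i }
complexities = {x: loop_complexity(x) for x in X}
```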

Let us consider the more complicated program B:

[program graph: a LOOP block on x containing g and h, followed by a conditional test p which either halts or returns control to the LOOP block]

Now the LOOP block may be executed many times, the number of
iterations each time (i.e., x) dependent upon previous executions.

To calculate B(X) we use our basic method: find those
inputs for which B halts on the first pass through p, on the
second pass, etc. Sum these terms to get B(X).

If f_0 is the functional path which terminates on the
first pass through p,

f_0(X) = p̄·Σ_{x∈X} (g·h)^x(X)

Similarly, if f_1 denotes the functional path which ends on the second
pass through p,

f_1(X) = p̄·Σ_{x'∈X'} (g·h)^{x'}·p·Σ_{x∈X} (g·h)^x(X)

Continuing in this manner,

B(X) = Σ_{i=0}^∞ f_i(X)

Since X is finite, there exists an n such that for all i>n,
f_i(X) = ∅.

∴ B(X) = Σ_{i=0}^n f_i(X)


While this is a valid equation for B(X), each
X_i = { x∈X | f_i(x) is defined } does not correspond to a single
value of complexity. Each f_i will have to be subdivided by
expanding out the summations. The subterms, f_i', will correspond
to single values of complexity; and the X_i',

X_i' = { x∈X | f_i'(x) is defined }

will be complexity equivalence classes.

4. Multiple Inputs

We now extend our complexity analysis procedures to
programs with more than one input. We will deal specifically
with the case of two inputs. The generalization to programs with
n inputs is immediate.

Let A be a program with two inputs, x and y. Analogous
to the one input case, we define the set U_1 as

U_1 = { (x,y) | A halts for non-negative integer inputs x and y;
       x and y valid }

We then define U_2 as the set of all ordered pairs of non-negative
integer values for the language in which A is written or the machine
on which A is running:

U_2 = { (x,y) | x and y are allowable non-negative integer values }

Then the maximal valid input set to program A is

Z = U_1 ∩ U_2

As before, Z' ⊆ Z is also a valid input set.

We can then proceed to find the functional paths of the
program graph from the starting node to a terminal node. The
function f associated with each path maps ordered pairs (x,y)
to (x',y'). f is composed of arithmetic and conditional func-
tions. For any (x,y)∈Z, we can calculate a complexity C(A,(x,y))
defined by a path through A.

Using the functional paths through A and noting that Z
is a valid input set, we can obtain the finite equation for A(Z):

A(Z) = Σ_{i=0}^m f_i(Z)

The f_i immediately lead to the complexity equivalence classes

Z_i = { (x,y)∈Z | f_i((x,y)) is defined }

and the equivalence class complexities

C(A,Z_i) = C(A,(x,y)), (x,y)∈Z_i

Before finding the Z_i, we may have had to expand the f_i to take
care of any function calls or LOOP blocks. We can now calculate
C̄(A,Z), C_max(A,Z), and C_min(A,Z) as before.

If we define the sets X and Y as

X = { x | ∃y such that (x,y)∈Z }
Y = { y | ∃x such that (x,y)∈Z }

we can easily see that Z ⊆ X×Y. For (x,y)∈Z implies that x∈X
and y∈Y, which imply that (x,y)∈X×Y. Note that Z is not necessarily
equal to X×Y. If, for inputs x_0∈X, y_0∈Y, A does not halt, then
(x_0,y_0) ∉ Z.

Inputs vs. Variables 

Although A may have n inputs, it may have m variables
(including input variables) internal to it, m > n. One way to handle
the m-n variables which are not inputs is to express them in terms
of transformations on the inputs. Thus if y is not an input
variable, then before it is used, it must have a value assigned to
it. This value is a function of the n-tuple of input values:

y = f((x_1,...,x_n))

Another way to solve the problem is to view A as having
m inputs. Those variables which are not actual inputs have a
default value assigned to them, say zero. If A is working correctly,
these variables will be assigned new values before their default
values are ever used.


5. Different Data Types 

So far, we have limited our discussion to programs which
operate on non-negative integers from 0 to intmax - an upper bound
imposed by either the language in which A is written or the machine
on which A is running. It is not conceptually difficult to extend
the complexity analysis procedure to cover other types of input.

In most implementations, negative integers range from
-1 to some lower limit: negmax. If we wish to analyze a program
A with an integer input (positive or negative), then our maximal
valid input set X would still be defined as U_1 ∩ U_2, but now

U_1 = { x | A(x) halts and x a valid integer input }

U_2 = { x | x an integer and negmax ≤ x ≤ intmax }

We would then proceed as before to find the program
equation A(X), the complexity equivalence classes X_i, and the
subset complexities C(A,X_i).

We may also easily handle character input (the character 
set on a machine is always finite) and floating point numbers 
(always finite in number because of the limitations on the magni- 
tude of the exponent and the precision of the fraction). In the 
most general case, we would have a program with several inputs of
different data types. The work on multiple inputs would apply here,
generalizing to n-tuples with components of different data types.


6. String Input 

We have not yet considered programs which have string 
or variable length input sequences. An example of such a program 
would be a compiler whose input is a series of characters in the 
source language which it is required to translate. There is often 
only one input variable in such programs; each time a new character 
is read, the value of that character is placed in the input variable. 
However, unlike previous programs where current values of variables 
could be defined in terms of transformations upon older values,
the new value of the input variable in the case of string input is 
not generally definable in terms of previous values. Although we 
might like to treat this new value as a separate input, the number 
of such new inputs is indeterminate, so we cannot directly apply 
our previous methods. 

One way out of this problem is to observe that compilers 
and similar programs usually read their string of inputs until the 
string is exhausted. Therefore, given an input string of n charac- 
ters, we know that the program processing this string will read in n 
distinct input values. If we view an n character string as being 
n inputs to the program, then each length string defines a diff- 
erent input situation. Thus, when analyzing the complexity of a 
program with string input (where the string is always read in its 
entirety), a set of valid inputs would be defined as a subset 
of all strings of a fixed length, say n. We would then treat the 
program as having n inputs. 


Chapter V. Results in the Complexity Theory of Programming 

1. Introduction 

We are now ready to investigate some of the proper- 
ties of the complexity-theoretic concepts previously intro- 
duced. We define some complexity relations between programs, 
define the complexity of concatenated programs, and study 
equivalence of programs in the light of complexity theory. We 
prove some results about these areas. 

2. Preliminary Definitions

We first define a special form of the complexity
equivalence classes of a program; this form will be used in
most of the definitions which follow. It is a standard which
will be used in comparing equivalence classes of different pro-
grams.

Definition: Given a program A with input set X,
we define the set X_A as follows:

X_A = { X_i | X_i is a complexity equivalence class
       of A with input set X }


Definition: X_A is said to be in normal form iff

(∀i,j) [ X_i ≠ ∅ (the empty set) and i≠j ⇒ C(A,X_i) ≠ C(A,X_j) ]


Normal form implies that all the complexity equivalence
classes X_i contain at least one element of X and the complexity
of any equivalence class, C(A,X_i), is unique. It is this
normal form which allows us to compare the set of all equiva-
lence classes of two different programs.

If X_A is not in normal form, it may be put in this
form by deleting any X_i = ∅, and, if C(A,X_i) = C(A,X_j),
replacing X_i and X_j by a new equivalence class X_{ij} = X_i ∪ X_j
where C(A,X_{ij}) = C(A,X_i). Note that the deletion of X_i = ∅ or
the creation of X_{ij} does not change C_max(A,X), or C_min(A,X), or
C̄(A,X). In the case of C̄(A,X), we have

X_i = ∅  ⇒  Pr(X_i) = 0

Also if we form X_{ij} = X_i ∪ X_j, then since X_i ∩ X_j = ∅,

Pr(X_{ij}) = Pr(X_i) + Pr(X_j)

∴ C(A,X_i)·Pr(X_i) + C(A,X_j)·Pr(X_j)

  = C(A,X_{ij})·(Pr(X_i) + Pr(X_j))

  = C(A,X_{ij})·Pr(X_{ij})

If we replace X_i and X_j by X_{ij}, C̄(A,X) remains the same.
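The normalization step is mechanical. A hedged Python sketch (our own representation: classes as sets, complexities as numbers, not the thesis's notation):

```python
def normal_form(classes, complexities):
    """Put X_A into normal form: delete empty classes and merge classes
    with equal complexity. Returns {C(A,X_i): merged class X_i}."""
    merged = {}
    for Xi, c in zip(classes, complexities):
        if not Xi:
            continue                                  # delete X_i = empty set
        merged[c] = merged.get(c, set()) | set(Xi)    # X_ij = X_i U X_j
    return merged
```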

Next we define a shorthand notation for the complexity
equivalence of single inputs and equivalence classes of inputs.

Definition: Given a program A, the relation =_A is
defined as follows:

For x_1, x_2 valid inputs to A, x_1 =_A x_2 iff C(A,x_1) = C(A,x_2).

For X_i, X_j equivalence classes of A, X_i =_A X_j iff
C(A,X_i) = C(A,X_j).

It is easy to see that =_A is an equivalence relation (reflexive,
symmetric, transitive) since it is defined in terms of another
equivalence relation (=).

Lemma 1: If X_A is in normal form and x_1,x_2∈X, then
x_1 =_A x_2 iff (∃ a unique X_i∈X_A such that x_1,x_2∈X_i).

Proof: (⇐) x_1,x_2∈X_i ⇒ C(A,x_1) = C(A,x_2) ⇒ x_1 =_A x_2.

(⇒) x_1 =_A x_2 ⇒ C(A,x_1) = C(A,x_2). Suppose
x_1∈X_i and x_2∈X_j. Then X_i =_A X_j. By normality of X_A, X_i = X_j.
Then x_1,x_2∈X_i. Since the X_i are disjoint, X_i is unique. QED

The lemma gives us another statement of the normal
form condition - all inputs with the same associated complexity
are in the same equivalence class. This result will be useful
in later sections.


3. Relations Between Programs with Identical Input Sets

We now turn to relations between programs which have
the same valid input set. We will develop a series of equiva-
lence relations between such programs, based on different com-
plexity properties of the programs.

The first equivalence relation, which we now define,
is also the weakest. It specifies a relation between programs
which divide the same input set into the same complexity
equivalence classes.

Definition: Given programs A and B, both with input
set X and with X_A and X_B in normal form, A and B are similar
on X iff X_A = X_B.

Similarity between two programs on a given input set
is equivalent to saying that two inputs to one program have the
same complexity if and only if they have the same complexity when
input to the other program. This may be formally stated in the
following theorem:

Theorem 1: A and B are similar on X iff
(∀x_1,x_2∈X) [ x_1 =_A x_2 ⟺ x_1 =_B x_2 ]


Proof: (⇒) Suppose x_1 =_A x_2. Then by lemma 1,
x_1,x_2∈X_i, X_i∈X_A, where X_A is in normal form. But by similarity,
X_A = X_B. Then X_i∈X_B. Therefore, x_1 =_B x_2. Likewise, x_1 =_B x_2 ⇒ x_1 =_A x_2.

(⇐) Construct X_i ⊆ X from all x∈X such that
x =_A x_i; similarly construct X_i' ⊆ X from all x∈X such that x =_B x_i.
Since x =_A x_i iff x =_B x_i, X_i = X_i'. Continuing for all x_i∈X, we
see that X_i∈X_A iff X_i∈X_B. Thus X_A = X_B. None of the X_i are
empty (because =_A and =_B are reflexive); if C(A,X_i) = C(A,X_j),
then by hypothesis C(B,X_i) = C(B,X_j). Therefore in normalized
form, X_A = X_B. Thus A and B are similar on X. QED
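Theorem 1 suggests a direct computational test for similarity. In the sketch below (our own helper names; complexities are modeled as plain functions on inputs, an assumption for illustration), two programs are similar on X exactly when the partitions of X induced by their complexity functions coincide.

```python
def induced_partition(cost, X):
    """Group inputs by complexity; the blocks are the normal-form
    complexity equivalence classes of the program with cost function cost."""
    blocks = {}
    for x in X:
        blocks.setdefault(cost(x), set()).add(x)
    return {frozenset(b) for b in blocks.values()}

def similar(cost_A, cost_B, X):
    """A and B are similar on X iff X_A = X_B (theorem 1)."""
    return induced_partition(cost_A, X) == induced_partition(cost_B, X)
```

Note that similarity compares only the partitions, not the complexity values themselves, matching the remark following the theorem.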

Although the complexities for the two inputs (x_1 and x_2)
are the same in each program, the magnitude of this complexity
for the two programs is, in general, different. Thus x_1 =_A x_2
and x_1 =_B x_2 do not imply that C(A,x_1) = C(B,x_1).

Similarity between two programs is preserved by
taking subsets of the original input set, and by intersection
of different input sets which induce similarity.

Theorem 2: If A and B are similar on X and also on Y,
they are similar on Z ⊆ X and on X ∩ Y.

Proof: (Subset) Let |X_A| = n. Define Z_i = Z ∩ X_i,
∀X_i∈X_A. Then

∪_{i=1}^n Z_i = ∪_{i=1}^n (Z ∩ X_i) = Z ∩ (∪_{i=1}^n X_i) = Z ∩ X = Z

Also, for i ≠ j,

Z_i ∩ Z_j = (Z ∩ X_i) ∩ (Z ∩ X_j) = Z ∩ (X_i ∩ X_j) = Z ∩ ∅ = ∅

Eliminating any Z_i = ∅, Z_A = { Z_i } is in normal form. Since
X_A = X_B, Z_A = Z_B. Therefore A and B are similar on Z.

(Intersection) X ∩ Y ⊆ X. Using the first part
of the proof, A and B are similar on X ∩ Y. QED

Absolute Similarity 

Certain pairs of programs may be similar on all valid 
input sets. We have the following definition: 

Definition : A and B are absolutely similar iff 
A and B are similar on all valid input sets X. 

Lemma 2 : If A and B are similar on the maximal 
valid input set X, they are absolutely similar. 

Proof: All valid input sets are subsets of the maximal 
valid input set. By theorem 2, since A and B are similar on 
the maximal input set, they are similar on all valid input sets. 
Thus A and B are absolutely similar. QED 


Lemma 3: If A and B are absolutely similar, and if
A and B are similar on X and on Y, they are similar on X ∪ Y.

Proof: The union of two valid input sets is also a
valid input set. By hypothesis, A and B are similar on all
valid input sets. QED


We define the relational operators <_A and >_A for single
inputs and input classes.

Definition: For x_1, x_2 valid inputs to A, x_1 >_A x_2
iff C(A,x_1) > C(A,x_2), and x_1 <_A x_2 iff C(A,x_1) < C(A,x_2).

For X_i, X_j equivalence classes of A, X_i >_A X_j iff
C(A,X_i) > C(A,X_j), and X_i <_A X_j iff C(A,X_i) < C(A,X_j).

Similarity tells us that for B similar to A, x_1 =_A x_2
iff x_1 =_B x_2. It also implies that x_1 ≠_A x_2 iff x_1 ≠_B x_2. We cannot
say that x_1 <_A x_2 implies x_1 <_B x_2, nor can we establish any other
ordering relation on the complexities of inputs. To do this,
we need to define another relation on programs with the same
input set which will tell us something more about the relative
magnitudes of the equivalence class complexities. This leads
to the following definition:


Definition: Suppose A and B are similar on X and we
order the equivalence class complexities C(A,X_i) and C(B,X_j')
in strictly increasing order (strictly increasing because X_A
and X_B are in normal form) to form two n-tuples (n = |X_A|):

(C(A,X_1), C(A,X_2), ..., C(A,X_n))
(C(B,X_1'), C(B,X_2'), ..., C(B,X_n'))

Because X_A = X_B we know that:

(∀X_i∈X_A) (∃X_j'∈X_B) [ X_i = X_j' ]
(∀X_j'∈X_B) (∃X_i∈X_A) [ X_j' = X_i ]

Then A and B are homomorphic on X iff (∀i, 1 ≤ i ≤ n) [ X_i = X_i' ].

The main property of the homomorphic relation, that
the order of complexity is preserved in homomorphic programs,
is shown in the following theorem:

Theorem 3: A and B are homomorphic on X iff (∀x_1,x_2∈X)
[ x_1 relop_A x_2 ⟺ x_1 relop_B x_2 ]
where relop ∈ {>,=,<} and is the same in both cases.


Proof: (⇒) Let x_1∈X_i, x_2∈X_j. Suppose C(A,X_i)
is in the i-th position of its ordered n-tuple (as above) and
C(A,X_j) is in the j-th position. By homomorphism, C(B,X_i) and
C(B,X_j) are in the i-th and j-th positions of their n-tuple. Then

C(A,X_i) relop C(A,X_j) iff C(B,X_i) relop C(B,X_j)
C(A,x_1) relop C(A,x_2) iff C(B,x_1) relop C(B,x_2)
x_1 relop_A x_2 iff x_1 relop_B x_2

where relop is the same operator in all cases.

(⇐) By hypothesis, (∀x_1,x_2∈X) [ x_1 relop_A x_2 iff
x_1 relop_B x_2 ]. In particular, x_1 =_A x_2 iff x_1 =_B x_2, so by theorem
1, A and B are similar on X. Now assume that in the ordered
n-tuples for C(A,X_i) and C(B,X_i'), (∃i) [ X_i ≠ X_i' ]. Let i be the
smallest index for which this is true. Suppose the element
corresponding to C(A,X_i) is C(B,X_j). We then have:

(C(A,X_1), C(A,X_2), ..., C(A,X_i), ...)
(C(B,X_1), C(B,X_2), ..., C(B,X_j), ...)

Since i is the smallest index, X_j could not have occurred
within the first i-1 complexities C(A,X_k), k ≤ i-1. Therefore,
C(A,X_j) > C(A,X_i). Similarly, C(B,X_i) > C(B,X_j). If x_1∈X_i,
x_2∈X_j, x_1 <_A x_2 but x_1 >_B x_2. Contradiction! Then our assumption
that X_i ≠ X_i' was wrong. Thus A and B are homomorphic on X. QED
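Homomorphism, too, can be tested mechanically. In this sketch (again with hypothetical cost functions standing in for C(A,·) and C(B,·)), each program's classes are sorted by complexity and compared position by position.

```python
def ordered_classes(cost, X):
    """Equivalence classes of X under cost, listed in strictly
    increasing order of their class complexity."""
    blocks = {}
    for x in X:
        blocks.setdefault(cost(x), set()).add(x)
    return [frozenset(blocks[c]) for c in sorted(blocks)]

def homomorphic(cost_A, cost_B, X):
    """A and B are homomorphic on X iff the i-th cheapest class of A
    is also the i-th cheapest class of B, for every i."""
    return ordered_classes(cost_A, X) == ordered_classes(cost_B, X)
```

The second test case below shows two programs that are similar but not homomorphic: the same partition, with the complexity ordering reversed.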

The homomorphism relation gives us the information we
need to determine the relative magnitudes of the equivalence
class complexities of two programs. To see if homomorphism
holds, we need only examine the two sets of complexity equiva-
lence classes and the associated complexities. If |X_A| ≪ |X|,
the number of objects we will have to examine is small compared
to the total number of inputs.

As with similarity, homomorphism is preserved by 
subset and intersection of input sets: 

Theorem 4: If A and B are homomorphic on X and on Y,
they are homomorphic on Z ⊆ X and on X ∩ Y.

Proof: (Subset) By theorem 2, A and B are similar on
Z. We can order the equivalence class complexities of A and B
on Z:

(C(A,Z_1), C(A,Z_2), ..., C(A,Z_r))
(C(B,Z_1'), C(B,Z_2'), ..., C(B,Z_r'))

where Z_i = X_i ∩ Z and Z_i' = X_i' ∩ Z as in the proof of theorem 2. Since
X_i = X_i', then Z_i = Z_i', ∀Z_i∈Z_A (= Z_B). ∴ A and B are homomorphic on Z.

(Intersection) X ∩ Y ⊆ X. Using the first part
of this proof, A and B are homomorphic on X ∩ Y. QED


Homomorphism and Complexity Limits 

In Chapter III, we introduced the notion of maximum
and minimum complexity on a set of inputs. We noted that these
quantities bound the resource usage of a program on a particular
input set. Here, we formally define these two quantities and
also two others.

Definition: Given a program A with input set X and
X_N in normal form, we can define the following quantities:

X_max(A) = that X_i for which C(A,X_i) = C_max(A,X)
X_min(A) = that X_i for which C(A,X_i) = C_min(A,X)

X_max and X_min are unique by the normal form condition. They tell
us which inputs will consume the greatest amount of the resource
being measured and which inputs will consume the least. These
two quantities are preserved by homomorphism:

Lemma 4: If A and B are homomorphic on X, then

X_max(A) = X_max(B) and X_min(A) = X_min(B)

Proof: Suppose X_max(A) = X_i. Then

(∀X_j ∈ X_N, j ≠ i)[C(A,X_i) > C(A,X_j)]

By homomorphism,

(∀X_j ∈ X_N, j ≠ i)[C(B,X_i) > C(B,X_j)]

so X_max(B) = X_i = X_max(A).

Similarly, X_min(A) = X_min(B). QED

We also state, without proof, the following lemma
which relates complexity limits to subsets of input sets:

Lemma 5: Given a program A with valid input sets X and Z:
if Z ⊆ X, then C_max(A,Z) ≤ C_max(A,X) and C_min(A,Z) ≥ C_min(A,X).

Thus C_max and C_min for the maximal valid input set limit the
corresponding quantities for all other valid input sets.


Absolute Homomorphism

Analogous to absolute similarity, we can define two
programs to be absolutely homomorphic if they are homomorphic
on all valid input sets. Results corresponding to lemmas 2
and 3 can be shown for absolute homomorphism.


We now define one last equivalence relation between 
programs with the same input set; this relation is stronger 
than homomorphism. 

Definition: A and B are isomorphic on X iff they are
similar on X and

(∀X_i ∈ X_N)[C(A,X_i) = C(B,X_i)]

Isomorphism is a special case of homomorphism: the
complexity associated with any equivalence class is the same
for both programs. As we would expect, isomorphism is preserved
by subset and intersection of input sets; however, it is also
preserved by union of input sets.

Theorem 5: If A and B are isomorphic on X and on Y,
they are also isomorphic on Z ⊆ X, X ∩ Y, and X ∪ Y.

Proof: (Subset) By theorem 4, A and B are homomorphic
on Z.

(∀Z_i ∈ Z_N)[C(A,Z_i) = C(A,X_i) = C(B,X_i) = C(B,Z_i)]

∴ A and B are isomorphic on Z.


(Intersection) Follows from subset proof.

(Union) Let Z = X ∪ Y. First we must show
that Z_A = Z_B. For program A, Z_A consists of the classes of X
not merged with any class of Y, the classes of Y not merged with
any class of X, and the merged classes:

Z_A = Z_a1 ∪ Z_a2 ∪ { X_i ∪ Y_j | X_i =_A Y_j } = Z_a1 ∪ Z_a2 ∪ Z_a3

Similarly, Z_B = Z_b1 ∪ Z_b2 ∪ Z_b3. By hypothesis, X_A = X_B, Y_A = Y_B and

(∀X_i,Y_j)[C(A,X_i) = C(B,X_i) and C(A,Y_j) = C(B,Y_j)]

∴ Z_a1 = Z_b1, Z_a2 = Z_b2, and,
since X_i =_A Y_j iff X_i =_B Y_j, Z_a3 = Z_b3. Thus Z_A = Z_B = Z_N.

Then, for Z_i ∈ Z_N,

C(A,Z_i) = { C(A,X_i), Z_i = X_i
           { C(A,Y_j), Z_i = Y_j

C(B,Z_i) = { C(B,X_i), Z_i = X_i
           { C(B,Y_j), Z_i = Y_j

(Note: if Z_i = X_i ∪ Y_j, C(A,X_i) = C(A,Y_j) and C(B,X_i) = C(B,Y_j).)

But by hypothesis, C(A,X_i) = C(B,X_i) and C(A,Y_j) = C(B,Y_j).
Therefore, C(A,Z_i) = C(B,Z_i), ∀Z_i ∈ Z_N. Thus A and B are isomorphic
on X ∪ Y. QED

Isomorphism not only preserves X_max and X_min, but also
C_max, C_min, and C(A,X); we state the following easy lemma without
proof:

Lemma 6: If A and B are isomorphic on X, then
C_max(A,X) = C_max(B,X), C_min(A,X) = C_min(B,X), and
C(A,X) = C(B,X) if the methods used in selecting the
inputs to A and B (i.e., Pr(X_i)) are the same.

We will return to isomorphism in the next section when 
we discuss concatenation of programs. 

4. Concatenation 

We will now investigate the complexity properties of
programs which may be combined, or concatenated, to form larger
programs. By the concatenation of two programs A and B, we
mean the program which is formed by appending B to the end of A,
so that when the locus of execution reaches the end of A, it
will enter B. We will assume that A places the outputs of its
computation into certain registers and that these same registers
are used by B as inputs. B is not required to use all of A's
outputs as inputs; however, B cannot need more inputs than A
can supply. We will say that A is I/O-compatible to B if this
restriction is obeyed.

We will also assume that neither A nor B is changed
by the concatenation of B onto A (which will be denoted by
"A•B"). We run into at least one trouble spot: A may finish
execution by halting somewhere in the middle. To continue with
the execution of B, we would need to change this HALT instruction
into a BRANCH to the end of A. This modification would probably
change A's complexity with respect to certain inputs. To avoid
this difficulty, we will assume that all programs terminate by
executing their last instruction. Thus we may perform conca-
tenation without changing instructions in either program.

If we wish to study the complexity of A•B for an input
set X, the inputs to B will have to be valid halting inputs.
Therefore, A(X), the set of inputs to B, will always be a valid
input set. Given this fact, the complexity of A•B for any input
x can be determined by linearity:

C(A•B,x) = C(A,x) + C(B,A(x))
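The linearity rule can be illustrated with a small sketch, assuming (purely for illustration) that programs are modeled as Python functions returning an (output, cost) pair under some operation-count measure:

```python
# Illustrative sketch of the linearity rule C(A.B, x) = C(A, x) + C(B, A(x)),
# assuming complexity is an operation count and each program reports
# (output, cost) for a given input. prog_a and prog_b are invented.

def prog_a(x):                 # hypothetical program A
    return x + 1, 2 * x        # (A(x), C(A, x))

def prog_b(y):                 # hypothetical program B
    return y * y, y + 3        # (B(y), C(B, y))

def concat(a, b):
    """Concatenation a.b: run a, feed its output to b; by linearity
    the cost of the whole is the sum of the two costs."""
    def ab(x):
        y, cost_a = a(x)
        z, cost_b = b(y)
        return z, cost_a + cost_b
    return ab

ab = concat(prog_a, prog_b)
print(ab(4))   # A(4)=5, B(5)=25; cost = 2*4 + (5+3) = 16, so (25, 16)
```

Note that the cost of B is evaluated at A's output A(x), not at the original input x, exactly as in the equation above.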



We now place another restriction on the concatenation
of programs which will enable us to prove some properties of the
complexity of concatenation. This restriction will be defined
in two stages as follows:

Definition: Let A be I/O-compatible to B, and for X
a valid input set to A, let A(X) = Y be a valid input set to B.
Then A is X_i-compatible to B iff for X_i ∈ X_A (in normal form),
there exists Y_j ∈ Y_B (in normal form) such that A(X_i) = Y'_i ⊆ Y_j.

X_i-compatibility tells us that a set of inputs (X_i)
which, by definition, have the same complexity for A (C(A,X_i))
will be transformed into a set of inputs (Y'_i) for B which will
all have the identical complexity, C(B,Y_j). We now extend
this relation to all equivalence classes of X:

Definition: A is X-compatible to B iff A is X_i-compatible
to B for all X_i ∈ X_A (in normal form).

If A is X-compatible to B, it is easy to see that for
x_1, x_2 ∈ X,

x_1 =_A x_2 => x_1 =_A•B x_2

However, the converse is not necessarily true. If x_1 ≠_A x_2 and
A(x_1) ≠_B A(x_2), it may still be that x_1 =_A•B x_2. By theorem 1,
we conclude that A and A•B are not necessarily similar on X.
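When the complexities themselves are known, the X-compatibility condition can be checked mechanically. The following sketch (with invented cost functions, not taken from the thesis) groups inputs by their A-complexity and verifies that each group's images under A share a single B-complexity:

```python
# Hypothetical sketch of the X-compatibility test: A is X-compatible to B
# iff every complexity equivalence class of A is mapped by A into a single
# complexity equivalence class of B (i.e., all images share one B-cost).

def x_compatible(inputs, run_a, cost_a, cost_b):
    """Group inputs by C(A, x); A is X-compatible to B iff each group's
    images A(x) all have the same complexity C(B, A(x))."""
    groups = {}
    for x in inputs:
        groups.setdefault(cost_a(x), []).append(x)
    return all(len({cost_b(run_a(x)) for x in grp}) == 1
               for grp in groups.values())

run_a = lambda x: x % 4     # A's output function (invented)
cost_a = lambda x: x % 2    # A's classes: even inputs, odd inputs
cost_b = lambda y: y % 2    # B's complexity on an input y
print(x_compatible(range(8), run_a, cost_a, cost_b))  # True
```

Replacing cost_b with the identity function makes the test fail, since the even class then maps onto two different B-complexities.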

Compatibility is preserved by several operations on
input sets. We state the following two theorems without including
the proofs.

Theorem 6: If A is X-compatible to B and also Y-
compatible to B, and if Z ⊆ X, then A is Z-compatible and X ∩ Y-com-
patible to B.

Theorem 7: If A is X-compatible to B and also Z-
compatible to B, and if for A(X_i) ⊆ Y_i and A(Z_i) ⊆ T_i,

X_i =_A Z_i => Y_i =_B T_i

then A is X ∪ Z-compatible to B.

In the case of theorem 7, we need the additional
restriction that (X_i =_A Z_i => Y_i =_B T_i) because if two equivalence
classes (X_i, Z_i) must be combined in the normal form of (X ∪ Z)
as a result of having the same complexity (X_i =_A Z_i), the images
(Y_i, T_i) of these classes under A must also have the same
complexity (Y_i =_B T_i) so that A(X_i ∪ Z_i) ⊆ Y_i ∪ T_i.


Compatibility and Isomorphism 

To conclude the section on concatenation, we discuss 
some relationships between compatibility and isomorphism. 
We would like to see under what conditions isomorphic programs 
can be concatenated to yield programs which are still isomorphic. 
We first show that isomorphism is preserved by the concatenation
of the isomorphic programs to a program which is compatible to
them.


Theorem 8: If A is Y-compatible to B and to C, and
if B and C are isomorphic on X = A(Y), then A•B and A•C are
isomorphic on Y.

Proof: Let D = A•B, E = A•C. We need to show that
Y_D = Y_E and (∀Y_i ∈ Y_D)[C(D,Y_i) = C(E,Y_i)]. Y_D is composed of two
subsets:

Y_D = { Y_i | (∀j≠i)[Y_i ≠_D Y_j] } ∪ { Y_i ∪ Y_j | Y_i =_D Y_j } = Y_D1 ∪ Y_D2

Similarly, Y_E = Y_E1 ∪ Y_E2.

Since C(B,X_i) = C(C,X_i), ∀X_i ∈ X_N, it must be that
Y_D1 = Y_E1 and Y_D2 = Y_E2. Therefore, Y_D = Y_E.

Then (∀Y_i ∈ Y_D),

C(D,Y_i) = (by linearity)
C(A,Y_i) + C(B,A(Y_i)) = (by isomorphism)
C(A,Y_i) + C(C,A(Y_i)) = (by linearity)
C(E,Y_i)

∴ A•B and A•C are isomorphic on Y. QED


Since A•B and A•C are isomorphic, we can continue the
concatenation if we can find a program D which is Z-compatible
to A•B and A•C and where D(Z) ⊆ Y. Then D•A•B and D•A•C will
be isomorphic on Z.

We now show the conditions under which the concatena- 
tion of pairs of isomorphic programs results in programs which 
are also isomorphic. 

Theorem 9: Suppose A and B are isomorphic on X,
A is X-compatible to C, B is X-compatible to D, and C and D are
isomorphic on Y = A(X) ∪ B(X). Then A•C and B•D are isomorphic
on X iff

(∀X_i ∈ X_A)(∃Y_j ∈ Y_C)[A(X_i), B(X_i) ⊆ Y_j]

Proof: (<=) Let E = A•C, F = B•D. Then

X_E = { X_i | (∀j≠i)[X_i ≠_E X_j] } ∪ { X_i ∪ X_j | X_i =_E X_j } = X_E1 ∪ X_E2

Similarly, X_F = X_F1 ∪ X_F2.

Since A(X_i), B(X_i) ⊆ Y_j and C and D are isomorphic
on Y, C(C,A(X_i)) = C(C,Y_j) = C(D,Y_j) = C(D,B(X_i)). Also,
C(A,X_i) = C(B,X_i), ∀X_i ∈ X_A. Thus, C(E,X_i) = C(F,X_i), ∀X_i.

∴ X_E1 = X_F1, X_E2 = X_F2, and thus X_E = X_F.

Then (∀X_i ∈ X_E),

C(E,X_i) = (by linearity)
C(A,X_i) + C(C,A(X_i)) = (by isomorphism)
C(B,X_i) + C(C,A(X_i)) = (by hypothesis)
C(B,X_i) + C(C,Y_j) = (by isomorphism)
C(B,X_i) + C(D,Y_j) = (by hypothesis)
C(B,X_i) + C(D,B(X_i)) = (by linearity)
C(F,X_i)

∴ A•C and B•D are isomorphic on X.

(=>) Given A(X_i) ⊆ Y_j, B(X_i) ⊆ Y_k, we want
to show that Y_j = Y_k. For X_i ∈ X_N, C(E,X_i) = C(F,X_i). By
linearity, C(A,X_i) + C(C,A(X_i)) = C(B,X_i) + C(D,B(X_i)). But
C(A,X_i) = C(B,X_i), so we must have that C(C,A(X_i)) =
C(D,B(X_i)); or C(C,Y_j) = C(D,Y_k). But since C and D are
isomorphic on Y, Y_j and Y_k have the same complexity and so
coincide in the normal form of Y. Therefore, Y_j = Y_k. QED

5. Functional Equivalence

It is often the case that we wish to examine two
programs which represent different algorithms for the same
function. The programs compute the same output when given the
same input from a specified input set; however, the method
(algorithm) used to compute the result is not the same. This
situation often arises when we wish to determine which version
of a particular subroutine or program we should use. All
versions represent (hopefully!) the same function, but one
version may use fewer resources than another. Program equivalence
may also arise in the area of simulation. If one program is
simulating another, and if the first program is also producing
the same output as the second, then the programs are equivalent.

Equivalence is defined here in terms of similar func-
tional behavior on a specified input set. Output values are
defined as transformations on input values. If output variables
are not also inputs, their values must be expressible in terms
of the inputs.

Definition: A and B are equivalent on X iff

(∀x ∈ X)[A(x) = B(x)]

It is easy to see that equivalence is closed under
subset, union, and intersection: if A and B are equivalent
on X and on Y, they are equivalent on Z ⊆ X, X ∩ Y, and X ∪ Y. Equiva-
lence and concatenation are related by the following lemma:

Lemma 7: If A and B are equivalent on X, C and D
equivalent on Y where A(X) ⊆ Y, and if A, B are I/O-compatible
to C, D, then A•C, A•D, B•C, B•D are equivalent on X.


Proof: By the I/O-compatibility restriction, A•C, etc.,
may be correctly formed. Now for x ∈ X, A(x) = B(x) = y. Since
y ∈ Y, C(y) = D(y). Therefore, A•C(x) = B•D(x) and A•C and B•D
are equivalent on X. Similarly for the other cases. QED

To conclude this chapter, we present the following
theorem which relates isomorphism, compatibility, and equivalence:

Theorem 10: Suppose A and B are isomorphic and
equivalent on X, A is X-compatible to C, B is X-compatible to D,
and C and D are isomorphic on Y = A(X) = B(X) (since A and B are
equivalent). Then A•C and B•D are isomorphic on X.

Proof: We look at theorem 9. Since A(x) = B(x),
∀x ∈ X, A(X_i) = B(X_i), ∀X_i ∈ X_N. Since A is X-compatible to
C, A(X_i) ⊆ Y_j, Y_j ∈ Y_C. But by the isomorphism of C and D on
Y, Y_j ∈ Y_D. Then B(X_i) ⊆ Y_j. Therefore, A•C and B•D are isomor-
phic on X. QED

If C and D are equivalent on Y in addition to being
isomorphic, we can use lemma 7 to conclude that A•C and B•D
are isomorphic and equivalent on X.

Chapter VI. Conclusions and Suggestions for Further Study 


We have developed a general theory of computational 
complexity for computer programs. We have looked at complexity 
from the viewpoint of resource usage and regarded the use of 
different resources as different measures of the complexity of 
a program. Many complexity measures fit into our theory, the 
only requirement being that the usage of the associated 
resource obey the linearity principle. 

Our theory has been based upon observing the behavior 
of a program on a valid set of inputs. We have also relied 
extensively on an equation representation of the transformations 
which a program applies to its inputs. We have attempted to 
justify the use of finite input sets by noting that real pro- 
grams, running on real computers, are able to accept and 
manipulate only a finite number of distinct input values, due 
to software and hardware limitations. In addition, most pro- 
grams actually process only a subset of all possible input 
values. Our methods were shown to be valid for programs with 
a large number of programming linguistic constructs and with 
several different input schemes. 

The complexity analysis of a program produced a set 
of complexity equivalence classes and a corresponding set of 
class complexities. We have noted that most programs do not 


treat every input value differently, and therefore the number 
of equivalence classes will generally be smaller than the number 
of possible inputs. Thus, these two sets give a relatively 
compact representation of the complexity information for a program 
operating on a given input set. These sets immediately led to 
some complexity parameters for the program: 

1. C(A,X) - the expected complexity value on the set X

2. C_max(A,X) - the maximum complexity on the set X

3. X_max(A) - those inputs which result in complexity C_max(A,X)

4. C_min(A,X) - the minimum complexity on the set X

5. X_min(A) - those inputs which result in complexity C_min(A,X)

C_max(A,X) and C_min(A,X) bound the resource usage of program A
and tell how much of this resource A can possibly use and how
much it must use to process inputs from set X.
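These parameters are straightforward to compute once the equivalence classes and their complexities are known. The following sketch uses invented classes, costs, and a uniform selection probability Pr(x); it is an illustration of the definitions, not an analysis from the thesis:

```python
# Minimal sketch: computing the five complexity parameters from a set of
# complexity equivalence classes. Classes, costs, and probabilities are
# invented for illustration.

eq_classes = {                        # X_i -> C(A, X_i)
    frozenset({1, 4, 7}): 3,
    frozenset({2, 5}): 9,
    frozenset({0, 3, 6}): 1,
}
pr = {x: 1 / 8 for x in range(8)}     # Pr(x): uniform input selection

c_max = max(eq_classes.values())      # C_max(A, X)
c_min = min(eq_classes.values())      # C_min(A, X)
x_max = next(cl for cl, c in eq_classes.items() if c == c_max)  # X_max(A)
x_min = next(cl for cl, c in eq_classes.items() if c == c_min)  # X_min(A)
# Expected complexity C(A, X): sum over classes of C(A, X_i) * Pr(X_i)
c_bar = sum(c * sum(pr[x] for x in cl) for cl, c in eq_classes.items())

print(c_max, c_min, sorted(x_max), sorted(x_min), c_bar)
```

The expected value weights each class complexity by the total probability of selecting an input from that class, matching the role of Pr(X_i) in lemma 6.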

Our basic purpose has been to develop some theoretical 
tools for studying the difficulty of computing functions by 
observing the program implementations of these functions. 
While these techniques will work for any program of the types
described, the necessary computations will become unmanageable
if the program equation is complex or if the number of complexity
equivalence classes is large. Thus, one could not expect to
sit down and find the equation and equivalence classes of a
PL/I compiler, just as one would be hard-pressed to prove that
such a compiler is "correct". Our techniques will probably
be most valuable in the analysis of small programs and also
in deciding which version of a particular program will be most
suitable for use; suitable in the sense of using the least
resources over a particular set of inputs, where we may wish
to minimize C(A,X), C_max(A,X), or C_min(A,X).

Programmers and systems analysts are often faced 
with this latter problem, particularly in a large programming 
system such as a language translator or operating system which 
is modularized and which has its basic components frequently 
replaced as the system is up-graded. The complexity analysis 
techniques allow different versions of program modules to be 
compared, perhaps in one case on the basis of maximum resource 
usage, in another with regard to average resource usage, in a 
third with respect to some combination of complexity parameters. 

Areas for Further Study 

We have used the concept of a program equation for 
much of our work. This equation is independent of a study 
of complexity. It gives a concise algebraic formulation of 
a program or algorithm. It is also valid on an infinite input 
set as long as all elements in this set are halting inputs for 
the program in question. The equation brings to light the 
transformational characteristics of a program. It should be 
useful in the study of other aspects of programs and programming 
languages, such as program semantics and program correctness. 


A subject which we have not examined but which is 
important is the use of space as a complexity measure. The 
use of space did not follow the linearity principle. An 
analogous constraint for space and its related measures would
have to be devised to analyze the complexity of programs with 
respect to these resources. 

We have not discussed the effects of transformations
of the program on the complexity. For example, suppose we make
a well-defined modification to program A with input set X.
Is there a well-defined effect on X_A, C_max(A,X), C_min(A,X)?
Cooper's work on graph transformations [8] may be useful here.
We have defined three equivalence relations between 

programs with the same input set. There may be other relations, 

intermediate in strength between similarity and isomorphism 
which reflect other programming situations. 

We have mentioned that it is advantageous for
|X| >> |X_N|. In this case, X_N and { C(A,X_i) | X_i ∈ X_N } give
us a more economical representation of the complexity informa-
tion for A than if we simply looked at all x in X. However, we
have not explored the relationship between the relative size of
X_N and the nature of A itself. Are there certain conditions for
which we get this economical representation?

Appendix - Mathematical Notation 


We make use of the notion of a set and various opera- 
tions upon it. A set is an unordered collection of objects 
and is named by a capital letter. The elements of a set are 
enclosed in braces and separated by commas. 

A subset of a given set is another set containing all,
some, or none of the elements of the original set. If Z is
a subset of X, we write Z ⊆ X. If a subset contains no elements,
it is called the empty set and denoted by ∅.

The size of a set X is written |X| and is simply the
number of elements in X.

The union of the sets X and Y, denoted X ∪ Y, is a
set containing those elements which are in either X or Y or
both.

The intersection of sets X and Y, denoted X ∩ Y, is
a set containing those elements which are in both X and Y.

We denote membership in a set by the symbol ∈. Thus,
a ∈ { a,b }. Similarly, c ∉ { a,b }.

A set may be specified by describing the conditions 
for membership in the set rather than listing all of the members. 
We use the notation 

X = { x | "conditions" }

This may be read "X is the set of all objects x such that the
conditions are true". Thus the set

Y = { x | x ∈ X_1 and x ∈ X_2 }

describes Y as the intersection of X_1 and X_2.

Ordered Pairs, Cross Products 

An ordered pair, denoted (x,y), is an object with
two components. The order of the components is significant:
(x,y) is distinct from (y,x) unless x = y.

A set of ordered pairs can be formed from sets of
single objects by the cross product operator. The cross pro-
duct of the sets X and Y, denoted X × Y, is defined as follows:

X × Y = { (x,y) | x ∈ X and y ∈ Y }


We use two logical quantifiers in our notation. The
universal quantifier, denoted ∀, may be read "for all". It
is used to qualify the statement following it. Thus,

(∀x ∈ X)[x ≤ y]

means "for all x in X, x ≤ y".

The existential quantifier, ∃, may be read "there
exists". The statement

(∃x ∈ X)[x ≤ y]

means "there exists an x in X such that x ≤ y".

Quantifiers may be grouped in a series to form more
complex logical statements. We might have

(∀x ∈ X)(∃y ∈ Y)[x ≤ y]

which states that "for all x in X, there exists a y in Y such
that x ≤ y".

Functional Composition 

The composition of functions f and g, denoted f•g, is
another function which transforms the domain of g into the
range of f. Thus if g(x) = y and f(y) = z, f•g(x) = z. Composi-
tion may be continued for any number of functions. If a function
f is composed with itself i times, we may abbreviate this as f^i.

Implication and Equivalence 

If the truth of statement A implies the truth of 
statement B, we write 

A => B 

This may also be read "if A then B". Similarly, if B implies
A, we write

B => A 


If A and B imply each other, they are said to be
equivalent and we write

A <=> B

We can also write equivalence as

A iff B

where "iff" is an abbreviation for "if and only if".



1. Hartmanis, J., and J. E. Hopcroft, "An Overview of the
Theory of Computational Complexity", Technical
Report No. 70-59, Department of Computer Science,
Cornell University, April 1970. (See also JACM
18(3), July 1971, for a later version of this report.)

2. Landin, P. J., "The Mechanical Evaluation of Expressions",
Computer Journal 6, 4 (January 1964), pp. 308-320.

3. Blum, M., "A Machine-Independent Theory of the Complexity
of Recursive Functions", JACM 14 (1967), pp. 322-336.

4. Meyer, A. R., and D. M. Ritchie, "The Complexity of LOOP
Programs", Proceedings of the 22nd National ACM Conference,
(1967), pp. 465-469.

5. Ramamoorthy, C. V., "Discrete Markov Analysis of Computer
Programs", Proceedings of the 20th National ACM Conference,
(1965), pp. 386-392.

6. Zeiger, H. P., "Formal Models of Some Features of Programming
Languages", Proceedings of the 3rd Annual Princeton
Conference on Information Sciences and Systems, (1969),
pp. 425-429.

7. Drake, A., Fundamentals of Applied Probability Theory,
McGraw-Hill Book Company (1967), New York.

8. Cooper, D. C., "Some Transformations and Standard Forms of
Graphs, with Applications to Computer Programs",
in Machine Intelligence 2, Dale and Michie (eds.),
American Elsevier Publishing Company, Inc., (1968),
New York.


Security Classification 



1. ORIGINATING ACTIVITY (Corporate author)

Massachusetts Institute of Technology
Project MAC






Complexity Measures for Programming Languages 

4. DESCRIPTIVE NOTES (Type of report and inclusive dates)

Technical Memorandum

5. AUTHOR(S) (Last name, first name, initial)

Goodman, Leonard I. 


September 1971 











Distribution of this document is unlimited.




Advanced Research Projects Agency 

3D-200 Pentagon 

Washington, D.C. 20301


A theory of complexity is developed for algorithms implemented in typical programming
languages. The complexity of a program may be interpreted in many ways; a method for
measuring a specific type of complexity is a complexity measure -- some function of the 
amount of a particular resource used by a program in processing an input. 

After the complexity of the basic program elements is determined, program complexity is 
analyzed with respect to single inputs and then with respect to finite sets of legiti- 
mate halting inputs. A program equation is developed to aid in the complexity analysis. 
Using this equation, an input set is partitioned into classes of constant complexity. 

Several equivalence relations are defined, relating different programs by their complex-
ity. Complexity is also discussed in terms of concatenation and functional equivalence
of programs.


Computational Complexity 
Program Resource Usage 

Complexity Measures 
Programming Languages 

Program Equations 

Program Equivalence Relations 

DD Form 1473 (M.I.T.)